K-Means for noise-insensitive multi-dimensional feature learning
Authors
Abstract
Many measurement modalities that image an object by probing it pixel-by-pixel, such as Photoacoustic Microscopy, produce a multi-dimensional feature (typically a time-domain signal) at each pixel. In principle, the many degrees of freedom in such a signal admit the possibility that significant multi-modal information about the underlying targets is implicitly present, far more than a single scalar "brightness". However, the measured signal is neither a weighted sum of basis functions (such as principal components) nor one of a set of prototypes (as in K-means), which has motivated the novel clustering method proposed here. Signals are clustered based on their shape, but not their amplitude, using angular distance, and centroids are computed as the direction of maximal intra-cluster variance, resulting in an algorithm capable of learning centroids (signal shapes) that relate to underlying, albeit unknown, target characteristics in a scalable and noise-robust manner.
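The abstract names the two key ingredients of the method: assignment by angular distance, so only signal shape (not amplitude) matters, and a centroid update taken as the direction of maximal intra-cluster variance. The following Python sketch illustrates such a scheme under stated assumptions (unit-normalized signals, sign-invariant cosine assignment, leading-singular-vector centroid update); the function name shape_kmeans and these specific choices are illustrative, not the authors' reference implementation.

import numpy as np

def shape_kmeans(X, k, n_iter=50, seed=0):
    # X: (n_signals, n_samples) array of per-pixel time-domain signals.
    # Normalize each signal to unit norm so that only its shape, not its
    # amplitude, influences the clustering.
    rng = np.random.default_rng(seed)
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    centroids = Xn[rng.choice(len(Xn), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign by angular (cosine) similarity; the absolute value treats
        # sign-flipped shapes as equivalent (an illustrative choice).
        labels = np.abs(Xn @ centroids.T).argmax(axis=1)
        for j in range(k):
            members = Xn[labels == j]
            if len(members) == 0:
                continue
            # Centroid update: direction of maximal intra-cluster variance,
            # i.e. the leading right singular vector of the member matrix.
            _, _, vt = np.linalg.svd(members, full_matrices=False)
            centroids[j] = vt[0]
    return labels, centroids

Applied to a (num_pixels, num_time_samples) stack of signals, the returned centroids can be inspected as learned signal shapes and the labels rendered as a per-pixel segmentation map.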
Similar Resources
Learning Feature Representations with K-Means
Many algorithms are available to learn deep hierarchies of features from unlabeled data, especially images. In many cases, these algorithms involve multi-layered networks of features (e.g., neural networks) that are sometimes tricky to train and tune and are difficult to scale up to many machines effectively. Recently, it has been found that K-means clustering can be used as a fast alternative ...
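That line of work typically learns a dictionary of centroids from (whitened) image patches and then encodes new patches against the dictionary, for example with a soft "triangle" activation. The following Python sketch assumes scikit-learn's MiniBatchKMeans; the encoding function and its name encode_triangle are illustrative rather than a quotation of the paper's recipe.

import numpy as np
from sklearn.cluster import MiniBatchKMeans

def learn_dictionary(patches, n_centroids=256):
    # patches: (n_patches, patch_dim) array, e.g. flattened, whitened image patches.
    km = MiniBatchKMeans(n_clusters=n_centroids, random_state=0).fit(patches)
    return km.cluster_centers_

def encode_triangle(patches, centroids):
    # "Triangle" encoding: a patch activates a centroid by how much closer it is
    # to that centroid than the mean centroid distance, clipped at zero.
    dists = np.linalg.norm(patches[:, None, :] - centroids[None, :, :], axis=2)
    return np.maximum(0.0, dists.mean(axis=1, keepdims=True) - dists)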
K-means-based Feature Learning for Protein Sequence Classification
Protein sequence classification has been a major challenge in bioinformatics and related fields for some time and remains so today. Due to the complexity and volume of protein data, algorithmic techniques such as sequence alignment are often unsuitable due to time and memory constraints. Heuristic methods based on machine learning are the dominant technique for classifying large sets of protein...
Learning the k in k-means
When clustering a dataset, the right number k of clusters to use is often not obvious, and choosing k automatically is a hard algorithmic problem. In this paper we present an improved algorithm for learning k while clustering. The G-means algorithm is based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution. G-means runs k-means with increasing k in a...
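The core of that procedure is a per-cluster Gaussianity check: tentatively split a cluster in two, project its points onto the axis joining the two child centroids, and apply an Anderson-Darling test to the one-dimensional projection. A rough Python sketch follows, assuming scipy's anderson and scikit-learn's KMeans; the helper name gmeans_should_split, the minimum-size guard, and the critical-value index are illustrative choices, not the paper's code.

import numpy as np
from scipy.stats import anderson
from sklearn.cluster import KMeans

def gmeans_should_split(cluster_points, critical_index=2):
    # Too few points to test reliably: keep the cluster as-is.
    if len(cluster_points) < 8:
        return False
    # Tentative 2-way split of the cluster.
    children = KMeans(n_clusters=2, n_init=10).fit(cluster_points)
    v = children.cluster_centers_[0] - children.cluster_centers_[1]
    # Project onto the axis joining the two child centroids.
    proj = cluster_points @ v / (np.linalg.norm(v) ** 2 + 1e-12)
    # Anderson-Darling test for normality of the 1-D projection; exceeding the
    # chosen critical value rejects Gaussianity, so the cluster should split.
    result = anderson(proj, dist="norm")
    return result.statistic > result.critical_values[critical_index]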
K-means Clustering with Feature Hashing
One of the major problems of K-means is that one must use dense vectors for its centroids, and therefore it is infeasible to store such huge vectors in memory when the feature space is high-dimensional. We address this issue by using feature hashing (Weinberger et al., 2009), a dimension-reduction technique, which can reduce the size of dense vectors while retaining sparsity of sparse vectors. ...
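The hashing trick maps an arbitrarily large sparse feature space into a fixed number of buckets, so the dense centroid vectors K-means must keep in memory stay small. A minimal sketch using scikit-learn's FeatureHasher, one standard implementation of the idea (not necessarily the exact setup evaluated in that paper):

from sklearn.cluster import KMeans
from sklearn.feature_extraction import FeatureHasher

# Hash sparse token-count dicts into a fixed 4096-dimensional space so that the
# K-means centroids stay bounded regardless of vocabulary size.
hasher = FeatureHasher(n_features=2**12, input_type="dict")
docs = [{"apple": 2, "banana": 1}, {"carrot": 3}, {"apple": 1, "carrot": 1}]
X = hasher.transform(docs)          # sparse matrix of shape (3, 4096)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)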
Hierarchical k-Means for Unsupervised Learning
In this paper we investigate how to accelerate k-means based unsupervised learning algorithms with hierarchical k-means. We show that hierarchical k-means significantly speeds up k-means based learning approaches in both the training and query phases at minimal cost to test accuracy. This speedup allows for much larger numbers of centroids to be used, which in turn leads to much better learning...
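One common way to realize this speedup is a centroid tree: recursively run K-means with a small branching factor, then answer nearest-centroid queries by greedy descent, comparing against only a handful of centroids per level. The sketch below follows that idea under stated assumptions; the dictionary-based tree and the names build_hkmeans and query are illustrative, not the paper's implementation.

import numpy as np
from sklearn.cluster import KMeans

def build_hkmeans(X, branching=8, min_leaf=32):
    # Recursively split the data with small-k K-means to form a centroid tree.
    if len(X) <= min_leaf:
        return {"centroid": X.mean(axis=0), "children": None}
    km = KMeans(n_clusters=branching, n_init=4).fit(X)
    children = [build_hkmeans(X[km.labels_ == j], branching, min_leaf)
                for j in range(branching) if np.any(km.labels_ == j)]
    return {"centroid": X.mean(axis=0), "children": children}

def query(tree, x):
    # Greedy descent: follow the nearest child centroid at each level, so each
    # query touches O(branching * depth) centroids instead of all leaves.
    node = tree
    while node["children"]:
        node = min(node["children"],
                   key=lambda c: np.linalg.norm(x - c["centroid"]))
    return node["centroid"]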
Journal
Journal Title: Pattern Recognition Letters
Year: 2023
ISSN: 1872-7344, 0167-8655
DOI: https://doi.org/10.1016/j.patrec.2023.04.009